Abstract

We study on which classes of graphs first-order logic (FO) and monadic second-order logic (MSO)
have the same expressive power. We show that for all classes C of graphs that are closed under taking
subgraphs, FO and MSO have the same expressive power on C if, and only if, C has bounded tree depth.
Tree depth is a graph invariant that measures the similarity of a graph to a star in a similar way that
tree width measures the similarity of a graph to a tree. For classes just closed under taking induced
subgraphs, we show an analogous result for guarded second-order logic (GSO), the variant of MSO
that not only allows quantification over vertex sets but also over edge sets. A key tool in our proof is
a Feferman-Vaught-type theorem that is constructive and still works for unbounded partitions.

Keywords: first-order logic, monadic second-order logic, guarded second-order logic, tree depth, 
graph classes 

1 Introduction

First-order logic (FO) and monadic second-order logic (MSO) are arguably among the most important 
logics studied in computer science, partly because of their tight links to finite automata and regular 
languages. It is well-known that MSO is strictly more expressive than FO; indeed, the difference in the
expressive power of the two logics manifests already on finite words: The MSO-definable classes of words
are precisely the regular languages [2, 6, 18], whereas the FO-definable classes are the star-free regular
languages [13, 16]. This implies, for example, that not even the class of all finite words of even length is
FO-definable.

In this paper, we study the question on which classes of structures MSO and FO have the same expres- 
sive power, with the focus lying on graph classes. Monadic second-order logic on graphs is commonly 
studied in two different versions: the first only allows quantification over vertex sets, whereas the second 
allows quantification over both vertex sets and edge sets. From now on, we use MSO to refer to the first 
version (with quantification over vertex sets only) and refer to the second version as guarded second-
order logic (GSO).¹ It is obvious that there are classes of graphs where the three logics are equally
expressive: all finite classes are examples, but it is also easy to construct infinite classes. Indeed, the
work of Dawar and Hella [4] implies that every infinite class of graphs has an infinite subclass on which
the sentences from FO, MSO, and GSO have the same expressive power; this can be proved using
diagonalization arguments that lead to completely artificial classes. Our main contribution in the present paper
is the characterization of natural classes of graphs on which the three logics are equally expressive. We
show that all classes of graphs of bounded tree depth have this property and, furthermore, under natural
closure conditions on the classes this is optimal: no other classes have this property.

¹ Another common terminology is to write MSO₁ instead of MSO and MSO₂ instead of GSO.

Let us explain our results in more detail: Tree depth is a graph invariant that was introduced by Nešetřil
and Ossona de Mendez in [14] and has turned out to be useful in various algorithmic applications. While
the definition of tree depth is fairly technical, there are two intuitive characterisations of graph classes of 
bounded tree depth: First, a class C of graphs has bounded tree depth if, and only if, the graphs in C have 
tree decompositions of bounded width where the decomposition trees have bounded height. Intuitively, 
this characterisation shows that tree depth measures the similarity of graphs to stars, whereas tree width 
measures their similarity to trees. The second characterisation states that a class C has bounded tree 
depth if, and only if, there is an upper bound on the lengths of simple paths in the graphs of C. (Note 
that this implies that the graphs have bounded diameter, but the two conditions are not equivalent, as the 
example of the class of complete graphs shows: all graphs in this class have diameter 1, but they still 
contain arbitrarily long paths.) 
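
To make the contrast between diameter and tree depth concrete, the following brute-force sketch computes the tree depth of tiny graphs. It assumes the standard recursive definition, which is not spelled out in the text above: a single vertex has tree depth 1, a disconnected graph takes the maximum over its components, and a connected graph takes 1 plus the minimum over all single-vertex deletions. The helper names are illustrative, and the running time is exponential.

```python
from functools import lru_cache
from itertools import combinations


def components(vertices, edges):
    """Split a vertex set into connected components of the induced subgraph."""
    remaining, comps = set(vertices), []
    while remaining:
        stack = [remaining.pop()]
        comp = set(stack)
        while stack:
            u = stack.pop()
            for v in list(remaining):
                if frozenset((u, v)) in edges:
                    remaining.remove(v)
                    comp.add(v)
                    stack.append(v)
        comps.append(frozenset(comp))
    return comps


def tree_depth(vertices, edges):
    """Tree depth of a nonempty graph, straight from the recursive definition."""
    edges = frozenset(frozenset(e) for e in edges)

    @lru_cache(maxsize=None)
    def td(vs):
        if len(vs) == 1:
            return 1
        comps = components(vs, edges)
        if len(comps) > 1:
            # Disconnected: the deepest component decides.
            return max(td(c) for c in comps)
        # Connected: delete the best possible root vertex.
        return 1 + min(td(vs - {v}) for v in vs)

    return td(frozenset(vertices))


def path(n):
    """Path on n vertices 0, ..., n-1."""
    return range(n), [(i, i + 1) for i in range(n - 1)]


def complete(n):
    """Complete graph on n vertices."""
    return range(n), list(combinations(range(n), 2))
```

Under these assumptions, the complete graph on five vertices has tree depth 5 although its diameter is 1, while paths on 7 and 8 vertices have tree depth 3 and 4, so the value grows (slowly) with the path length.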

Theorem 1. Let C be a class of graphs of bounded tree depth. Then FO, MSO, and GSO have the same 
expressive power on C. 

In [5] it is shown that all GSO-definable decision problems on graphs of bounded tree depth are in
the complexity class AC⁰. One might wonder whether Theorem 1 is not already implied by this result
in view of the well-known descriptive characterisation of AC⁰ by FO due to Barrington, Immerman and
Straubing [1]. It is the other way round, however: Our theorem is significantly stronger, because the
characterisation of AC⁰ requires FO with built-in arithmetic. In our context, this makes a big difference
when it comes to defining tree decompositions (on which we could then simulate an automaton corre- 
sponding to a given GSO-sentence): without having at least a built-in order, there is no hope of defining 
decompositions because in general they are not invariant under automorphisms of the underlying graph, 
and only automorphism-invariant objects are definable. One approach to resolving this issue without a 
built-in order would be to use the treelike decompositions of [9, 10], but we take a different route that
avoids the explicit construction of decompositions within the logic altogether. For our proof we develop a 
constructive Feferman-Vaught-type composition theorem that shows how to first-order reduce the evalu- 
ation of a GSO-formula on a structure to the evaluation of GSO-formulas on an unbounded, even infinite, 
number of substructures. Using this theorem and the characterization of tree depth in terms of a bounded 
number of parallel vertex eliminations, we are able to evaluate GSO-formulas on graphs of bounded tree 
depth using first-order formulas. 

Theorem 1 prompts the question of whether there are classes of unbounded tree depth on which FO
has the same expressive power as MSO or GSO. Indeed there are such classes since, as noted above,
there is a (highly artificial) infinite class of paths where the logics coincide; and by the second of the
above characterisations of tree depth an infinite class of paths has unbounded tree depth. However, when
looking for classes of structures satisfying natural closure conditions, it turns out that for classes closed
under taking subgraphs, Theorem 1 is optimal.

Theorem 2. Let C be a class of graphs closed under taking subgraphs. Then the following three state- 
ments are equivalent: 

1. FO and MSO have the same expressive power on C.

2. FO and GSO have the same expressive power on C. 

3. C has bounded tree depth. 

This follows immediately from Theorem 1 because a class of unbounded tree depth that is closed
under taking subgraphs contains all paths, and MSO is strictly more expressive than FO on the class of 
all paths. For example, "even cardinality" is expressible in MSO, but not in FO on the class of paths. 
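
As an illustration of the MSO side of this claim, the sketch below brute-forces one possible MSO sentence for even cardinality on paths: there is a vertex set X such that every edge has exactly one endpoint in X, and two vertices of degree 1 lie on opposite sides of X. This particular sentence and the helper names are illustrative choices, not taken from the text; the set quantifier is evaluated by enumerating all subsets.

```python
from itertools import combinations


def path_graph(n):
    """Undirected path on n vertices, edges stored as two-element frozensets."""
    return list(range(n)), {frozenset((i, i + 1)) for i in range(n - 1)}


def even_by_mso(vertices, edges):
    """Brute-force evaluation of the sentence 'exists X: X properly 2-colours
    the edges and separates two degree-1 vertices' on a finite graph."""
    degree = {v: sum(1 for e in edges if v in e) for v in vertices}
    ends = [v for v in vertices if degree[v] == 1]
    for k in range(len(vertices) + 1):
        for X in map(set, combinations(vertices, k)):
            # Every edge has exactly one endpoint in X.
            proper = all(len(e & X) == 1 for e in edges)
            # Some endpoint of the path is in X and another one is not.
            split = any(s in X and t not in X for s in ends for t in ends)
            if proper and split:
                return True
    return False
```

On a path, the only candidate sets X are the two colour classes of the alternating colouring, and the two endpoints receive different colours exactly when the number of vertices is even.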

A weaker closure condition we consider next is closure under taking induced subgraphs. (Remember
that a subgraph of a graph is obtained by arbitrarily deleting vertices and edges, whereas an induced
subgraph is obtained by only deleting vertices and the edges incident with these vertices.) Theorem 2
does not extend to all classes closed under taking induced subgraphs as the class of all complete graphs 
shows: It is closed under taking induced subgraphs, it has unbounded tree depth, and FO and MSO have 
the same expressive power on it. However, for GSO the result can be extended to classes closed under 
taking induced subgraphs. 

Theorem 3. Let C be a class of graphs closed under taking induced subgraphs. Then the following 
statements are equivalent: 

1. FO and GSO have the same expressive power on C. 

2. C has bounded tree depth. 

As opposed to Theorem 2, this theorem is not an immediate consequence of Theorem 1. The proof
of the forward direction relies on a Ramsey-type lemma stating that for every k there is an n such that 
every graph that contains a path of length n (not necessarily as an induced subgraph) contains either a 
complete graph with k vertices or a complete bipartite graph with k vertices in each shore or a path of 
length k as an induced subgraph. This lemma may be of independent interest. 

Let us finally remark that it cannot be taken for granted that every logic coincides with first-order logic 
on some natural, infinite classes of graphs. For instance, the analogue to Theorem 3 for full second-order
logic states that for all classes C of graphs closed under taking induced subgraphs, FO and SO have the 
same expressive power on C if, and only if, C is finite (up to isomorphism). This follows from Ramsey's 
theorem, which implies that every infinite class of graphs closed under taking induced subgraphs either 
contains the class of all complete graphs or the class of all graphs with no edges and that on both of these 
graph classes SO is strictly more expressive than FO. 

Related Work. The expressive power of both first-order logic and monadic second-order logic has been 
extensively studied in various contexts. On words, various automata theoretic and algebraic character-
isations are known (see [11]). The monadic second-order logic of graphs and in particular the relation
between MSO and GSO on various graph classes has been intensively studied by Courcelle and his col-
laborators (see [3]). However, to the best of our knowledge the simple question we study here has not
been addressed in the literature. 

Organization of This Paper. In Section 2 we review the logics used and describe conventions used
throughout the paper. Section 3 contains the statement and proof of the composition theorem that is used
in Section 4 to prove Theorem 1. Finally, Theorem 3 is proven in Section 5.

2 Review of First-Order and Second-Order Logics 

In the present section we fix the basic terminology and review the logics FO, MSO, and GSO. Concerning 
MSO and GSO, we review the definition of types as used in [11] and a standard construction that shows
how we can restrict attention to second-order variables only. 

In the present paper, all vocabularies $\tau$ are finite and purely relational, that is, we do not consider
constant or function symbols. We write $R \in \tau$ to indicate that $R$ is a relation symbol in $\tau$ and $R^r \in \tau$ to
additionally indicate that $R$'s arity is $r$. A relation symbol of arity 1 is called monadic. A structure $\mathcal{S}$
over a vocabulary $\tau$ consists of a universe $S$ and one subset $R^{\mathcal{S}} \subseteq S^r$ for each $R^r \in \tau$. A structure $\mathcal{S}$ is a
substructure of a structure $\mathcal{S}'$ over the same vocabulary if $S \subseteq S'$ and for each $R \in \tau$ we have $R^{\mathcal{S}} \subseteq R^{\mathcal{S}'}$.
We say that $\mathcal{S}$ is an induced substructure if, in addition, for all $R^r \in \tau$ we have $R^{\mathcal{S}} = R^{\mathcal{S}'} \cap S^r$. Given two
structures $\mathcal{S}$ and $\mathcal{T}$ over the same vocabulary $\tau$, their union has universe $S \cup T$ and for every $R \in \tau$, we
have $R^{\mathcal{S} \cup \mathcal{T}} = R^{\mathcal{S}} \cup R^{\mathcal{T}}$.

Given an arbitrary $r$-ary relation $R \subseteq S^r$ on a universe $S$, we define the Gaifman graph of $R$ as the
undirected graph whose vertex set is $S$ and where there is an edge between two distinct vertices $u, v \in S$ if,
and only if, there is a tuple $(s_1, \dots, s_r) \in R$ with $u = s_i$ and $v = s_j$ for some $i, j \in \{1, \dots, r\}$. The Gaifman
graph of a structure $\mathcal{S}$ is the union of all Gaifman graphs of relations $R^{\mathcal{S}}$ for $R \in \tau$. An arbitrary relation
$R$ on $S$ is called guarded in $\mathcal{S}$ if the Gaifman graph of $R$ is a subgraph of the Gaifman graph of $\mathcal{S}$. Note
that a monadic relation is automatically guarded, because its Gaifman graph contains no edges.
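
The definitions of Gaifman graphs and guardedness translate directly into a small check. The sketch below is only an illustration: relations are encoded as Python sets of tuples, a structure is a dictionary from (hypothetical) relation names to relations, and only the edge sets are compared because all graphs involved share the vertex set $S$.

```python
from itertools import combinations


def gaifman_edges(relation):
    """Edges of the Gaifman graph of a relation given as a set of tuples."""
    edges = set()
    for tup in relation:
        # Any two distinct elements occurring in a common tuple are adjacent.
        for u, v in combinations(set(tup), 2):
            edges.add(frozenset((u, v)))
    return edges


def gaifman_graph_of_structure(relations):
    """Union of the Gaifman graphs of all relations of a structure."""
    edges = set()
    for rel in relations.values():
        edges |= gaifman_edges(rel)
    return edges


def is_guarded(relation, relations):
    """A relation is guarded if its Gaifman graph is a subgraph of the
    Gaifman graph of the structure."""
    return gaifman_edges(relation) <= gaifman_graph_of_structure(relations)
```

For a graph with edge relation {(1, 2), (2, 3)}, the relation {(1, 2)} is guarded, {(1, 3)} is not, and every monadic relation such as {(1,), (3,)} is guarded because it contributes no edges.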

We denote first-order and second-order variables by lowercase and uppercase Latin letters, respec-
tively. For a second-order variable $X$, its arity is a number $r(X) \in \mathbb{N}$. A variable assignment for a struc-
ture $\mathcal{S}$ is a mapping $a$ whose domain is a finite set of first- and second-order variables that maps each
first-order variable $x$ to an element $a(x) \in S$ and each second-order variable $X$ to a subset $a(X) \subseteq S^{r(X)}$.

Given a vocabulary $\tau$, the first-order formulas over $\tau$ are defined inductively in the usual way; we
just remark that we consider $x = y$ to be an atomic formula, so equality is always available. We only
consider the existential quantifier $\exists$, the conjunction symbol $\wedge$, and the negation symbol $\neg$ to be part of
the formal syntax. Free and bound variables of a formula are defined in the usual way. The set of all
first-order formulas over a vocabulary $\tau$ is denoted by $\mathrm{FO}[\tau]$. Similarly, we define $\mathrm{SO}[\tau]$ as the set of all
second-order formulas also in the usual way. Again, only the existential quantifier is formally part of the
syntax. We use lowercase Greek letters like $\varphi$ and $\psi$ for first-order formulas and uppercase Greek letters
like $\Phi$ and $\Psi$ for second-order formulas.

Given a structure $\mathcal{S}$, a formula $\Phi$, and a variable assignment $a$ that assigns a value to every free
variable of $\Phi$, we write $(\mathcal{S}, a) \models \Phi$ to indicate that $(\mathcal{S}, a)$ is a model of $\Phi$, where the modeling relation
is defined in the usual way. We write $\mathrm{Mod}(\Phi)$ for the set of all pairs $(\mathcal{S}, a)$ with $(\mathcal{S}, a) \models \Phi$. Assuming
that $\Phi$ has exactly the free variables $x_1$ to $x_n$ and $X_1$ to $X_m$ and assuming that $a(x_i) = a_i \in S$ and $a(X_i) =
A_i \subseteq S^{r(X_i)}$, we also write $\mathcal{S} \models \Phi(a_1, \dots, a_n, A_1, \dots, A_m)$ instead of $(\mathcal{S}, a) \models \Phi$.

Two restrictions of second-order logic will be of particular interest. The first is monadic second-order 
logic, which is defined by restricting the syntax of second-order formulas: the class $\mathrm{MSO}[\tau]$ contains all
formulas from $\mathrm{SO}[\tau]$ where all second-order variables are monadic (have arity 1). Second, we consider
guarded second-order logic HI, which is defined by restricting the semantics of second-order logic: the 
class $\mathrm{GSO}[\tau]$ is exactly $\mathrm{SO}[\tau]$, but the semantics is restricted by allowing only guarded relations to be
assigned to relational variables (bound or free) in the inductive definition of the semantics of composed
formulas. For graph structures, MSO is sometimes denoted by MSO₁ and GSO by MSO₂.

Second-Order Formulas without First-Order Variables. It will be convenient to consider only second-
order formulas that do not contain any first-order variables (free or bound). For this purpose, it will
be necessary to introduce two new atomic formulas: First, $\mathrm{empty}(X)$ is an atomic formula for every
monadic second-order variable $X$ and semantically $(\mathcal{S}, a) \models \mathrm{empty}(X)$ means $a(X) = \emptyset$. Second,
$\mathrm{elem}(Y_1, \dots, Y_r, Z)$ is an atomic formula, where the $Y_i$ are monadic second-order variables and $Z$ is ei-
ther an $r$-ary relation symbol from $\tau$ or it is an $r$-ary second-order variable. Semantically, $(\mathcal{S}, a) \models
\mathrm{elem}(Y_1, \dots, Y_r, Z)$ means $|a(Y_i)| = 1$ for each $i \in \{1, \dots, r\}$, and $a(Y_1) \times \dots \times a(Y_r) \subseteq Z^{\mathcal{S}}$ when $Z$ is a
relation symbol and $a(Y_1) \times \dots \times a(Y_r) \subseteq a(Z)$ when $Z$ is a second-order variable.

These new atomic formulas can be used to transform any MSO (or GSO) formula with first-order vari-
ables into an equivalent MSO (or GSO) formula without first-order variables: First, for every first-order
variable $x$ we introduce a fresh monadic second-order variable $X$. Second, replace every occurrence of
the atom $x = y$ by the formula $\mathrm{elem}(X, Y) \wedge \mathrm{elem}(Y, X)$. Third, replace every occurrence of $Z(x_1, \dots, x_r)$,
where $Z$ is either a relation symbol or a second-order variable, by $\mathrm{elem}(X_1, \dots, X_r, Z)$. Finally, replace
each quantification $\exists x(\Phi)$ by the formula $\exists X(\mathrm{elem}(X, X) \wedge \Phi)$, expressing that $X$ is a singleton set and
$\Phi$ holds. In the following, we will assume that second-order formulas do not contain first-order variables.
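
The four rewriting steps can be sketched on a small abstract syntax tree. The tuple encoding of formulas, the naming scheme `X_x` for fresh variables, and the function names below are assumptions made for illustration only; in particular, the sketch assumes that no existing second-order variable is already called `X_x`.

```python
def fresh(x):
    """Name of the monadic second-order variable standing in for x (hypothetical scheme)."""
    return f"X_{x}"


def eliminate_fo(phi):
    """Rewrite a formula so that it uses no first-order variables.

    Input connectives: ("eq", x, y), ("rel", Z, (x1, ..., xr)), ("not", f),
    ("and", f, g), ("exists1", x, f) for first-order and ("exists2", X, f)
    for second-order quantification. Output atoms have the shape
    ("elem", (Y1, ..., Yr), Z).
    """
    op = phi[0]
    if op == "eq":
        _, x, y = phi
        X, Y = fresh(x), fresh(y)
        # x = y becomes elem(X, Y) and elem(Y, X).
        return ("and", ("elem", (X,), Y), ("elem", (Y,), X))
    if op == "rel":
        _, Z, args = phi
        # Z(x1, ..., xr) becomes elem(X1, ..., Xr, Z).
        return ("elem", tuple(fresh(x) for x in args), Z)
    if op == "not":
        return ("not", eliminate_fo(phi[1]))
    if op == "and":
        return ("and", eliminate_fo(phi[1]), eliminate_fo(phi[2]))
    if op == "exists1":
        _, x, body = phi
        X = fresh(x)
        # elem(X, X) forces X to be a singleton.
        return ("exists2", X, ("and", ("elem", (X,), X), eliminate_fo(body)))
    if op == "exists2":
        _, X, body = phi
        return ("exists2", X, eliminate_fo(body))
    raise ValueError(f"unknown connective: {op!r}")
```

For instance, the first-order sentence stating that some vertex has a self-loop, encoded as `("exists1", "x", ("rel", "E", ("x", "x")))`, is rewritten into a formula that quantifies over a singleton set `X_x` and tests `elem(X_x, X_x, E)`.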

Types. The quantifier rank $\mathrm{qr}(\Phi)$ of a formula $\Phi$ is defined in the usual way as the nesting depth of
quantifiers (not necessarily alternating) in the formula; for instance $\mathrm{qr}(\exists X \exists Y\, R(X, Y))$ is 2. Let $\mathrm{GSO}_{k,q}[\tau]$
be the set of second-order formulas $\Phi$ whose free variables lie in $\{X_1, \dots, X_k\}$ and for which $\mathrm{qr}(\Phi) \le q$.
The set $\mathrm{MSO}_{k,q}[\tau]$ is defined analogously, only for monadic formulas.

We say that two formulas $\Phi$ and $\Psi$ are equivalent, written $\Phi \equiv \Psi$, if $\mathrm{Mod}(\Phi) = \mathrm{Mod}(\Psi)$. For
a set $F$ of formulas, let us write $F/{\equiv}$ for the set of all equivalence classes of $F$ with respect to the
equivalence relation $\equiv$. The following fact can be proven by defining normal forms for the formulas of
the corresponding logic, see for example [11]:

Fact 4. For every finite vocabulary $\tau$ and for every $k$ and $q$ the sets $\mathrm{MSO}_{k,q}[\tau]/{\equiv}$ and $\mathrm{GSO}_{k,q}[\tau]/{\equiv}$
contain only finitely many equivalence classes.

Definition 5 (GSO- and MSO-Types). For a structure $\mathcal{S}$ and a guarded variable assignment $a$ with domain
$\{X_1, \dots, X_k\}$, we call the set of all formulas $\Phi \in \mathrm{GSO}_{k,q}[\tau]$ with $(\mathcal{S}, a) \models \Phi$ the $k$-variable rank-$q$ GSO-
type of $(\mathcal{S}, a)$; we denote it by $\mathrm{type}^{\mathrm{GSO}}_{k,q}(\mathcal{S}, a)$. The definition of MSO-types is analogous.

By Fact 4 there are only finitely many different $k$-variable rank-$q$ types (for every fixed vocabulary $\tau$)
since there are only finitely many non-equivalent formulas in $\mathrm{MSO}_{k,q}[\tau]$ and $\mathrm{GSO}_{k,q}[\tau]$. Thus, we can
view these types also as symbols of finite alphabets $\Sigma^{\mathrm{MSO}}_{k,q}$ and $\Sigma^{\mathrm{GSO}}_{k,q}$. An additional consequence of Fact 4
is that we can describe types by formulas:

Lemma 6. Let $L$ be MSO or GSO. For every type $T \in \Sigma^{L}_{k,q}$ there is a formula $\Lambda_T \in L_{k,q}[\tau]$ such that
for every $\tau$-structure $\mathcal{S}$ and every guarded variable assignment $a$ we have $(\mathcal{S}, a) \models \Lambda_T$ if and only if
$T = \mathrm{type}^{L}_{k,q}(\mathcal{S}, a)$.

Proof. By Fact 4 there is a finite representative system $R$ of $L_{k,q}[\tau]/{\equiv}$. Let $\Lambda_T = \bigwedge_{\Pi \in R \cap T} \Pi \wedge \bigwedge_{\Pi \in R \setminus T} \neg\Pi$.
Clearly, $\Lambda_T$ is true exactly if the (representative) formulas from $T$ are true in $(\mathcal{S}, a)$ and the (representative)
formulas not in $T$ are false. By definition, this is exactly the case when $T = \mathrm{type}^{L}_{k,q}(\mathcal{S}, a)$. □

For two logics $L_1$ and $L_2$ (like FO and MSO or FO and GSO) and a class C of structures, we say that $L_2$
is at least as expressive as $L_1$ on C (and write $L_1 \le_C L_2$) if for every $L_1$-sentence $\varphi_1$ there is an $L_2$-sentence
$\varphi_2$ such that $\mathrm{Mod}_{L_1}(\varphi_1) \cap C = \mathrm{Mod}_{L_2}(\varphi_2) \cap C$. We say that $L_1$ and $L_2$ are equally expressive on C (and
write $L_1 \equiv_C L_2$) if $L_1 \le_C L_2$ and $L_2 \le_C L_1$.

3 A Constructive Feferman-Vaught-type Composition Theorem 
for Unbounded Partitions 

The question answered by Feferman-Vaught-type composition theorems is the following: Suppose a log-
ical structure $\mathcal{S}$ is the disjoint union of two structures $\mathcal{S}_1$ and $\mathcal{S}_2$ and suppose we wish to find out whether
$\mathcal{S} \models \Phi$ holds; can we decide this solely based on knowing which formulas hold in $\mathcal{S}_1$ and $\mathcal{S}_2$? Intuitively,
this should be the case at least for logics like monadic second-order logic where a formula cannot "estab-
lish connections" between the two disjoint parts of $\mathcal{S}$. Indeed, a basic version of the Feferman-Vaught
theorem [12, 7] states exactly this: For every formula $\Phi \in \mathrm{MSO}_{k,q}$ we can decide $\mathcal{S} \models \Phi$ solely based on
knowing which formulas $\Psi \in \mathrm{MSO}_{k,q}$ have the property $\mathcal{S}_1 \models \Psi$ and which have the property $\mathcal{S}_2 \models \Psi$.
Phrased in terms of $k$-variable rank-$q$ MSO-types, the theorem states that the type of $\mathcal{S}$ is uniquely deter-
mined by the types of $\mathcal{S}_1$ and $\mathcal{S}_2$. An elegant proof of this uses that the $k$-variable rank-$q$ MSO-type of
a structure is uniquely determined by which Ehrenfeucht-Fraïssé game strategies can be played on the
structure. Since strategies for the individual structures $\mathcal{S}_1$ and $\mathcal{S}_2$ can be combined into a strategy for the
structure $\mathcal{S}$, we get the claim.

The basic version of the Feferman-Vaught theorem can be extended in several ways. First, instead
of considering only two structures, one can consider an unbounded, even infinite, number of structures;
the proof based on Ehrenfeucht-Fraïssé games will still work. Second, one can make explicit how we
can compute the type of $\mathcal{S}$ when the types of $\mathcal{S}_1$ and $\mathcal{S}_2$ are given as input. Third, one can allow that the
structures are not completely disjoint, but have a fixed-size intersection.

The first two directions of extension, "unbounded" and "constructive", appear to be quite incompat-
ible at first sight. Constructive Feferman-Vaught theorems for MSO roughly state that for each formula
$\Phi \in \mathrm{MSO}_{k,q}$ one can construct a propositional formula $F$ that has two propositional variables $p^1_\Psi$ and $p^2_\Psi$
for one representative $\Psi$ of each equivalence class $[\Psi]_{\equiv} \in \mathrm{MSO}_{k,q}/{\equiv}$. When we set these propositional
variables to true or false, depending on whether the formulas $\Psi$ hold in $\mathcal{S}_1$ and $\mathcal{S}_2$, the formula $F$ will
evaluate to true if, and only if, $\mathcal{S} \models \Phi$. Clearly, one can extend this idea to any fixed number of structures
$\mathcal{S}_1, \dots, \mathcal{S}_k$ by introducing new propositional variables $p^1_\Psi$ to $p^k_\Psi$ (as done, for instance, in fVT\), but the
construction will not work for an unbounded number of structures, let alone for an infinite number.

In the present section, we present a theorem that can be seen as a "constructive, unbounded" Feferman-
Vaught-type theorem. The idea is to use first-order formulas rather than propositional formulas in order
to "evaluate" whether a formula holds in a structure $\mathcal{S}$ that is the disjoint union of an arbitrary number
of structures $\mathcal{S}_i$ for $i \in I$ (actually, we allow that the structures have a fixed-size intersection). Instead of
having to introduce new propositional variables as the number of structures increases, we simply enlarge
the universe: We consider a structure $\mathcal{I}$ whose universe is the index set $I$. Instead of using propositional
variables $p^i_\Psi$ inside a propositional formula $F$, we now use atomic formulas $T_\Psi(i)$ inside a first-order
formula $\alpha$ where $T_\Psi$ is a monadic relation symbol that "tells us whether $\Psi$ holds in the structure $\mathcal{S}_i$".
(Actually, we use types instead of formulas, but this is purely a matter of taste.) The result is a first-order
formula $\alpha_\Phi$ that "takes a structure as input that encodes which formulas hold in the structures $\mathcal{S}_i$" and
"outputs whether $\mathcal{S} \models \Phi$ holds".

Our composition theorem encompasses both the classical Feferman-Vaught theorems for infinite
index sets and the constructive versions for a fixed number of structures as special cases: The classical
infinite version simply states that there is some mapping from the types of the structures $\mathcal{S}_i$ to the type
of $\mathcal{S}$; we show that this mapping is first-order definable. For a fixed-size index set $I$, we obtain the bounded
version by reformulating the question $\mathcal{I} \models \alpha_\Phi$ for the fixed-size structure $\mathcal{I}$ using propositional logic.

Concerning proof techniques, the main problem in proving constructive composition theorems is 
the handling of existentially quantified formulas $\exists X(\Phi)$. When indicator variables or atoms tell us that
$\exists X(\Phi_1)$ and $\exists X(\Phi_2)$ hold in some structure $\mathcal{S}_i$, two different assignments for the variable $X$ might be the
cause. This makes it necessary to combine the information concerning the types of the structures $\mathcal{S}_i$ in
rather intricate ways. For the case of a bounded number of structures, one typically computes disjunctive
normal forms of intermediate propositional formulas in an inductive process and then combines these
normal forms to form a new formula (a detailed proof of this kind is given in [3]). We cannot apply
this "normal form method" since it fails when the number of structures is not fixed. Our approach is, 
essentially, to ignore the problem of "conflicting" assignments and to use the fact that the type indicators 
are such simple structures that first-order formulas have rather special model theoretic properties on them. 

In later sections we will apply our composition theorem only to structures of bounded tree depth. 
Nevertheless, it holds for arbitrary structures, which can have arbitrary tree depth. 

3.1 Indicator Structures and Type Indicators 

In order to formulate our composition theorem, we first need to define a logical structure that encodes 
information about the types of logical structures $\mathcal{S}_i$ for $i \in I$. Toward this aim, we first introduce indicator
structures and later type indicators. Indicator structures are akin to strings, but there is no ordering.

Definition 7 (Indicator Structure). Let $\Sigma$ be an alphabet (a finite nonempty set). Let $\tau_\Sigma$ be the vocabulary
that contains one monadic relation symbol $T \in \tau_\Sigma$ for each $T \in \Sigma$. An indicator structure is a $\tau_\Sigma$-structure
$\mathcal{I}$ such that for each $i \in I$ there is exactly one $T \in \Sigma$ with $\mathcal{I} \models T(i)$.

The following lemma will be a crucial technical tool in the proof of our composition theorem. It
states, essentially, that for every first-order formula $\alpha$ describing a set of indicator structures over a fixed
universe we can find a "well-behaved" first-order formula $\beta_\alpha$ that describes the same set of indicator
structures, but whose class of models enjoys a number of closure properties, namely being "closed under
universe-preserving extensions" and its minimal models all being indicator structures.

Lemma 8. Let $\Sigma$ be an alphabet. For every first-order $\tau_\Sigma$-formula $\alpha$ there is a first-order $\tau_\Sigma$-formula $\beta_\alpha$
such that for every $\tau_\Sigma$-structure $\mathcal{B}$ the following holds: $\mathcal{B} \models \beta_\alpha$ if, and only if, there is a structure $\mathcal{A} \models \alpha$
that is (a) an indicator structure, (b) a substructure of $\mathcal{B}$, and (c) has the same universe as $\mathcal{B}$, that is, $A = B$.

Proof. Let $\Sigma = \{T_1, \dots, T_t\}$. Let $\mathcal{B}$ be an arbitrary indicator structure over the vocabulary $\tau_\Sigma$. Observe
that since $\tau_\Sigma$ does not contain any non-monadic relation symbols, the elements of the universe $B$ of $\mathcal{B}$ can
only be considered "in isolation" by a first-order formula. More formally, let $c_j$ denote the cardinality
of $\{i \in B \mid \mathcal{B} \models T_j(i)\}$. Then whether $\mathcal{B} \models \varphi$ holds for some first-order formula $\varphi$ can depend only on
the value of the number vector $(c_1, \dots, c_t) \in \mathbb{N}^t$. Using Ehrenfeucht-Fraïssé games, one can prove (see
[11, Exercise IV.3.2] for a detailed argument) that for every $\alpha$ there is a constant $C \in \mathbb{N}$ such that for
the "capped cardinalities" $c'_j = \min\{c_j, C\}$ we have $\mathcal{B} \models \alpha$ if, and only if, $(c'_1, \dots, c'_t) \in Z$ for some fixed
set $Z \subseteq \{0, \dots, C\}^t$ of number vectors. For a number vector $z = (z_1, \dots, z_t) \in \{0, \dots, C\}^t$ let us define
a formula $\beta_z$ that "tests" whether the "capped cardinalities" of a structure $\mathcal{B}$ are exactly $z$. It has the
following form:

\[
\exists i_1 \dots \exists i_n \Bigl( \varphi_{\mathrm{distinct}}(i_1, \dots, i_n) \wedge T^1(i_1) \wedge \dots \wedge T^n(i_n) \wedge
\forall j \bigl[ \varphi_{\mathrm{distinct}}(i_1, \dots, i_n, j) \to \bigl( T^{n+1}(j) \vee \dots \vee T^m(j) \bigr) \bigr] \Bigr).
\]

Here, $\varphi_{\mathrm{distinct}}$ is a standard formula for expressing that elements are distinct. The symbols $T^1$ to $T^n$ are
chosen from $\Sigma$ in such a way that exactly $z_1$ of them are $T_1$, exactly $z_2$ of them are $T_2$, and so on; thus
$n = \sum_{j=1}^{t} z_j$. The symbols $T^{n+1}$ to $T^m$ are exactly those $T_j$ for which $z_j = C$. To see that this construction
is correct, just note that the formula expresses "there are indices $i_1$ to $i_n$ where the cardinalities of the
symbols are exactly as prescribed by $z$ and at all other indices the symbol is one of the capped symbols".

We claim that setting $\beta_\alpha = \bigvee_{z \in Z} \beta_z$ yields the sought formula $\beta_\alpha$. By the above arguments, $\beta_\alpha$ and $\alpha$
have exactly the same models when we restrict attention to indicator structures. To show that $\beta_\alpha$ has the
claimed properties, we argue as follows: For the only-if-part, suppose $\mathcal{B} \models \beta_\alpha$. Then, by construction,
$\mathcal{B} \models \beta_z$ holds for some $z \in Z$ and, thus, there is an indicator structure $\mathcal{A} \models \alpha$ that is a substructure of $\mathcal{B}$
and has the same universe. For the if-part, consider an indicator structure $\mathcal{A}$ that is a model of $\alpha$. Then,
it is also a model of $\beta_\alpha$ and by the monotonicity of $\beta_\alpha$, every extension of $\mathcal{A}$ over the same universe is
also a model of $\beta_\alpha$. □
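
The statement of Lemma 8 can be sanity-checked by brute force on tiny universes. The sketch below is an illustration under an ad hoc encoding and is not part of the proof: a $\tau_\Sigma$-structure is a list of symbol sets (one set per element), the cap $C$ and the set $Z$ of capped vectors are passed in as inputs rather than derived from a formula $\alpha$, and all function names are hypothetical.

```python
from itertools import permutations, product


def capped_vector(assignment, alphabet, cap):
    """Capped symbol counts min(c_j, cap) of an indicator structure,
    given as one symbol per element."""
    return tuple(min(sum(1 for s in assignment if s == t), cap) for t in alphabet)


def has_indicator_substructure(B, alphabet, cap, Z):
    """Semantic side: some indicator structure inside B with the same
    universe has its capped vector in Z."""
    choices = [sorted(symbols) for symbols in B]
    return any(capped_vector(a, alphabet, cap) in Z for a in product(*choices))


def satisfies_beta_z(B, alphabet, cap, z):
    """Syntactic side: brute-force evaluation of the formula beta_z on B."""
    # One slot per prescribed occurrence of a symbol (T^1, ..., T^n).
    slots = [t for t, count in zip(alphabet, z) for _ in range(count)]
    # Symbols whose prescribed count hits the cap (T^{n+1}, ..., T^m).
    capped = {t for t, count in zip(alphabet, z) if count == cap}
    for chosen in permutations(range(len(B)), len(slots)):
        slots_ok = all(t in B[i] for t, i in zip(slots, chosen))
        rest_ok = all(B[j] & capped for j in range(len(B)) if j not in chosen)
        if slots_ok and rest_ok:
            return True
    return False


def satisfies_beta_alpha(B, alphabet, cap, Z):
    """beta_alpha is the disjunction of beta_z over all z in Z."""
    return any(satisfies_beta_z(B, alphabet, cap, z) for z in Z)
```

Comparing `satisfies_beta_alpha` with `has_indicator_substructure` on all structures with at most four elements over a two-letter alphabet gives a quick empirical check of the equivalence; note that elements carrying no symbol at all make both sides fail, as expected.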

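Recall that $\Sigma^{\mathrm{MSO}}_{k,q}$ and $\Sigma^{\mathrm{GSO}}_{k,q}$, which contain the $k$-variable rank-$q$ types for the two different logics, are
finite alphabets. In particular, we can use them as alphabets for an indicator structure, but let us write
$\tau^{\mathrm{MSO}}_{k,q}$ and $\tau^{\mathrm{GSO}}_{k,q}$ for $\tau_{\Sigma^{\mathrm{MSO}}_{k,q}}$ and $\tau_{\Sigma^{\mathrm{GSO}}_{k,q}}$.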

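Definition 9 (GSO- and MSO-Type Indicators). Let $q$ and $k$ be fixed. Let $I$ be an index set (not necessarily
finite) and let $F = (\mathcal{S}_i)_{i \in I}$ be a family of structures. Let $U = \bigcup_{i \in I} S_i$. Let $a$ map each variable $X \in
\{X_1, \dots, X_k\}$ to a subset $a(X) \subseteq U^{r(X)}$ and let $a_i(X) = a(X) \cap S_i^{r(X)}$ be its restriction to the universe of
$\mathcal{S}_i$. The GSO-type indicator is the $\tau^{\mathrm{GSO}}_{k,q}$-structure $\mathcal{I}^{\mathrm{GSO}}_{k,q}(F, a)$ with universe $I$ where for each type symbol
$T \in \Sigma^{\mathrm{GSO}}_{k,q}$ we have $T^{\mathcal{I}^{\mathrm{GSO}}_{k,q}(F, a)} = \{ i \in I \mid T = \mathrm{type}^{\mathrm{GSO}}_{k,q}(\mathcal{S}_i, a_i) \}$. The definition for MSO-type indicators is
analogous.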

Both GSO- and MSO-type indicators encode a lot of information concerning the structures $\mathcal{S}_i$ by en-
coding their types. In particular, we can use them to find out whether a given formula $\Phi$ holds in some $\mathcal{S}_i$.
The lemma below follows directly from the definition.

Definition 10. Let $L \in \{\mathrm{GSO}, \mathrm{MSO}\}$. For $\Phi \in L_{k,q}[\tau]$ let $\gamma_\Phi(i) = \bigvee_{T \in \Sigma^{L}_{k,q},\, \Phi \in T} T(i)$.

Lemma 11. Let $L \in \{\mathrm{GSO}, \mathrm{MSO}\}$. For every family $F = (\mathcal{S}_i)_{i \in I}$, every variable assignment $a$, every
$\Phi \in L_{k,q}[\tau]$, and every $i \in I$, we have $(\mathcal{S}_i, a_i) \models \Phi$ if, and only if, $\mathcal{I}^{L}_{k,q}(F, a) \models \gamma_\Phi(i)$.

3.2 Formulation and Proof of the Composition Theorem 

Definition 12 (Rooted Structure). Let $w \ge 0$ denote a width. A width-$w$ rooted structure is a logical
structure $\mathcal{S}$ over a vocabulary $\tau$ in which there are special monadic relation symbols $B_1$ to $B_w$ such that
for each $i \in \{1, \dots, w\}$ the set $B_i^{\mathcal{S}}$ is a singleton (has exactly one element). We say that $B(\mathcal{S}) = \bigcup_{i=1}^{w} B_i^{\mathcal{S}}$
is the bag of $\mathcal{S}$.

Definition 13 (Rooted Partition). Let $\mathcal{S}$ be a rooted $\tau$-structure. A rooted partition of $\mathcal{S}$ is a family
$(\mathcal{S}_i)_{i \in I}$ of $\tau$-structures such that the following holds: (a) The union of all $\mathcal{S}_i$ is exactly $\mathcal{S}$; (b) each $\mathcal{S}_i$ is
an induced substructure of $\mathcal{S}$; and (c) for all distinct $i, j \in I$ we have $S_i \cap S_j = B(\mathcal{S})$.

Note that in a width-0 rooted partition of $\mathcal{S}$, the structure $\mathcal{S}$ is the disjoint union of the $\mathcal{S}_i$.

Theorem 14 (Composition Theorem). Let $L$ be the logic MSO or GSO. For every $\tau$-formula $\Phi \in L_{k,q}[\tau]$
and every width $w$, there is a first-order $\tau^{L}_{k,q}$-formula $\alpha_{\Phi,w}$ without free variables such that the following
holds: For every rooted partition $F = (\mathcal{S}_i)_{i \in I}$ of a width-$w$ rooted $\tau$-structure $\mathcal{S}$ and every guarded
variable assignment $a$ we have

\[ \mathcal{I}^{L}_{k,q}(F, a) \models \alpha_{\Phi,w} \quad\text{if, and only if,}\quad (\mathcal{S}, a) \models \Phi. \]

Proof. The proof is by induction on the structure of $\Phi$. We start with the atomic formulas. Since $w$ is
fixed throughout the proof, we write $\alpha_\Phi$ instead of $\alpha_{\Phi,w}$.

For $\Phi = \mathrm{empty}(X)$ we can set $\alpha_\Phi$ to $\forall i\,(\gamma_{\mathrm{empty}(X)}(i))$ where $\gamma$ is the formula from Definition 10. By
Lemma 11, $\mathcal{I}^{L}_{k,q}(F, a) \models \forall i\,(\gamma_{\mathrm{empty}(X)}(i))$ means that for all $i \in I$ we have $(\mathcal{S}_i, a_i) \models \mathrm{empty}(X)$. Clearly,
this is the case if, and only if, $(\mathcal{S}, a) \models \mathrm{empty}(X)$.

For $\Phi = \mathrm{elem}(X_1, \dots, X_r, R)$ with $R \in \tau$, we set $\alpha_\Phi$ to

\[
\exists i \Bigl( \gamma_\Phi(i) \wedge \forall j \Bigl( j \ne i \to \bigwedge_{m=1}^{r} \Bigl( \gamma_{\mathrm{empty}(X_m)}(j) \vee \bigvee_{\ell=1}^{w} \gamma_{\mathrm{elem}(X_m, B_\ell)}(j) \Bigr) \Bigr) \Bigr).
\]

For the correctness proof first assume that $(\mathcal{S}, a) \models \Phi$ holds. Then we know $|a(X_m)| = 1$ for each $m \in
\{1, \dots, r\}$ and $a(X_1) \times \dots \times a(X_r) \subseteq R^{\mathcal{S}}$. Since $\mathcal{S}$ is the union of the $\mathcal{S}_i$, we have $a(X_1) \times \dots \times a(X_r) \subseteq R^{\mathcal{S}_i}$
for some $\mathcal{S}_i$. Moreover, each singleton $a(X_m)$ is either part of the bag $B(\mathcal{S})$, and we have $a(X_m) \subseteq B_\ell^{\mathcal{S}_j}$
for some $\ell \in \{1, \dots, w\}$ and all $\mathcal{S}_j$, or it is not part of the bag, and $a(X_m) \not\subseteq S_j$ holds for all $j \ne i$.
For the other direction assume $\mathcal{I}^{L}_{k,q}(F, a) \models \alpha_\Phi$. The formula witnesses that there exists some $i$ with
$(\mathcal{S}_i, a_i) \models \mathrm{elem}(X_1, \dots, X_r, R)$ and for all other $\mathcal{S}_j$ and sets $a(X_m)$, we have $S_j \cap a(X_m) \subseteq B(\mathcal{S})$. From the
definition of rooted partitions, we know $B(\mathcal{S}) \subseteq S_i$, which implies $a(X_m) \subseteq S_i$ for each $m \in \{1, \dots, r\}$.
Thus, $(\mathcal{S}, a) \models \mathrm{elem}(X_1, \dots, X_r, R)$ follows from $(\mathcal{S}_i, a_i) \models \mathrm{elem}(X_1, \dots, X_r, R)$.

For $\Phi = \mathrm{elem}(X_1, \dots, X_r, Z)$, where $Z$ is an $r$-ary second-order variable, the formula and correctness
arguments are the same, except that we work with a guarded relation $a(Z)$ that is assigned to $Z$ instead
of a relation $R^{\mathcal{S}}$ from the structure.

For the inductive step, we start with $\Phi = \neg\Phi'$. Here we can set $\alpha_\Phi = \neg\alpha_{\Phi'}$. Clearly, this has the
required properties. Similarly, for $\Phi = \Phi_1 \wedge \Phi_2$, setting $\alpha_\Phi = \alpha_{\Phi_1} \wedge \alpha_{\Phi_2}$ also has the desired properties.

The difficult case is $\Phi = \exists X(\Phi')$. Let $\alpha_{\Phi'}$ be the $\tau^{L}_{k+1,q-1}$-formula resulting from the inductive assump-
tion. We apply Lemma 8 to $\alpha_{\Phi'}$, resulting in a formula $\beta_{\alpha_{\Phi'}}$, which we abbreviate as $\beta$ in the following.
Recall that $\beta$ has the following properties: A structure $\mathcal{B}$ is a model of $\beta$ if, and only if, there is a structure
$\mathcal{A} \models \alpha_{\Phi'}$ that is (a) an indicator structure, (b) a substructure of $\mathcal{B}$, and (c) has the same universe as $\mathcal{B}$.

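Let $b_j$ denote the single element of $B_j^{S}$ for $j \in \{1,\dots,w\}$. Define $\alpha_\Phi = \bigvee_{C \subseteq B(S)^r} \alpha_C$, where each $\alpha_C$
is obtained from $\beta$ as follows: In $\beta$, replace every occurrence of an atom $T(i)$ for some type $T \in \tau_{k,q-1}$
and some first-order variable $i$ by $\Psi_T(i)$ with

$$\Psi_T = \exists X \Bigl( \lambda_T \wedge \bigwedge_{(b_{j_1},\dots,b_{j_r}) \in C} \mathrm{elem}(B_{j_1},\dots,B_{j_r},X) \wedge \bigwedge_{(b_{j_1},\dots,b_{j_r}) \in B(S)^r \setminus C} \neg\,\mathrm{elem}(B_{j_1},\dots,B_{j_r},X) \Bigr).$$

Here, $\lambda_T$ is the formula from Lemma 6 expressing that $T$ is the type of some structure.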

Before we proceed to prove that $\alpha_\Phi$ defined in this way satisfies the equivalence claimed in the theorem,
let us try to get some intuition. Ignoring $C$ for the moment (let us just assume that $B(S)$ is empty),
$\Psi_T$ states "Can we set $X$ to some relation $R$ that is guarded in $S_i$ for which the type of $S_i$ is exactly $T$?"
This means that when we replace an occurrence of $T(i)$ by $\Psi_T(i)$, we turn the question "Is it true that $T$ is
the type of $(S_i,a)$?" into the question "Is it true that $T$ is the type of $(S_i,a[X \mapsto R])$ for some set $R$?" (Let
$a[X \mapsto R]$ denote the variable assignment that is identical to $a$, except that $X$ is mapped to $R$.) We now
see that $\alpha_\Phi$ "almost" tests whether $\Phi'$ holds in $(S,a[X \mapsto R])$. The problem is that for each replacement
of some $T(i)$ by $\Psi_T(i)$ a different set $R$ might cause $T(i)$ to hold, while we need a "global" $R$ that can
be used as a value for $a(X)$. This is the point where the set $C$ and the special properties of $\beta$ become
important: The set $C$ ensures that all chosen $R$ agree on the bag across all replacements. The special
properties of $\beta$ will ensure that we can pick a single $R$ consistently.

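Claim. Fix the set $C$. Define a $\tau_{k,q-1}$-structure $\mathcal{T}_C$ (which will typically not be an indicator structure)
with universe $I$ as follows: Let $i \in T^{\mathcal{T}_C}$ if there exists a relation $R \subseteq S_i^r$ that is guarded in $S_i$, for which
$R \cap B(S)^r = C$, and such that $T = \mathrm{type}_{k,q-1}(S_i,a[X \mapsto R])$. Then

$$\mathcal{I}_{k,q}(F,a) \models \alpha_C \iff \mathcal{T}_C \models \beta. \qquad (*)$$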

Proof. In the formula $\alpha_C$, each occurrence of an atom $T(i)$ has been replaced by $\Psi_T(i)$. By definition,
$T(i)$ holds in $\mathcal{T}_C$ if, and only if, there is a relation $R$ guarded in $S_i$ with $R \cap B(S)^r = C$ such that $T$ is the
$k$-variable rank-$(q-1)$ type of $(S_i,a[X \mapsto R])$. However, having a look at the definition of $\Psi_T$, we see that
$\Psi_T(i)$ will be true exactly if this is the case. □

Let us now prove the equivalence claimed in the theorem. First assume that $(S,a) \models \Phi$. Then there is
a set $R \subseteq S^r$ guarded in $S$ such that $(S,a[X \mapsto R]) \models \Phi'$. Let $C = R \cap B(S)^r$. By the induction hypothesis,
$\mathcal{I}_{k,q-1}(F,a[X \mapsto R]) \models \alpha_{\Phi'}$. Observe that $\mathcal{I}_{k,q-1}(F,a[X \mapsto R])$ has the following three properties: (a) It is
an indicator structure since all type indicators are indicator structures, (b) it is a substructure of $\mathcal{T}_C$, and
(c) it has the same universe $I$ as $\mathcal{T}_C$. By Lemma 8 we can conclude that $\mathcal{T}_C$ is a model of $\beta$. By (*), this
implies $\mathcal{I}_{k,q}(F,a) \models \alpha_C$, which in turn implies $\mathcal{I}_{k,q}(F,a) \models \alpha_\Phi$.

For the second direction, assume that $\mathcal{I}_{k,q}(F,a) \models \alpha_\Phi$ holds. Then $\mathcal{I}_{k,q}(F,a) \models \alpha_C$ must hold for
some $C$. By (*), this means that $\mathcal{T}_C \models \beta$. By Lemma 8 we can conclude that $\alpha_{\Phi'}$ must have a model
$\mathcal{A}$ that is (a) an indicator structure, (b) a substructure of $\mathcal{T}_C$, and (c) has the universe $I$. However,
(a) and (c) together imply that $\mathcal{A}$ is a type indicator. Together with (b) and the definition of $\mathcal{T}_C$, we
can now conclude that for every $i \in I$ there is a relation $R_i \subseteq S_i^r$ guarded in $S_i$ with $R_i \cap B(S)^r = C$
and $T = \mathrm{type}_{k,q-1}(S_i,a[X \mapsto R_i])$. Setting $R = \bigcup_{i \in I} R_i$, we get a single guarded relation $R$ such that
$\mathcal{I}_{k,q-1}(F,a[X \mapsto R]) = \mathcal{A}$. Hence, $\mathcal{I}_{k,q-1}(F,a[X \mapsto R]) \models \alpha_{\Phi'}$. Applying the inductive assumption yields
$(S,a[X \mapsto R]) \models \Phi'$, from which we can directly conclude $(S,a) \models \Phi$. □
4 FO, MSO and GSO Coincide on Graphs of Bounded Tree Depth



In the present section we prove Theorem 1 from the introduction. The tree depth of graphs can be defined
recursively as follows [14], where $G[U]$ is the subgraph of $G$ induced on the nodes in $U$:

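Definition 15 (Tree Depth). Let $G = (V,E)$ be a graph with connected components $(G_i)_{i \in I}$. Its tree depth
$\mathrm{td}(G)$ is

$$\mathrm{td}(G) = \begin{cases} 1 & \text{if } |V| = 1, \\ 1 + \min_{x \in V} \mathrm{td}(G[V \setminus \{x\}]) & \text{if } |V| > 1 \text{ and } |I| = 1, \\ \max_{i \in I} \mathrm{td}(G_i) & \text{otherwise.} \end{cases}$$

We say that a class $C$ of graphs has bounded tree depth if there exists a constant $d \in \mathbb{N}$ such that $\mathrm{td}(G) \le d$
for every $G \in C$.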
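The recursion in Definition 15 can be evaluated directly on small graphs. The following is a minimal sketch, not part of the original text: it assumes a graph is given as a dictionary mapping each vertex to its set of neighbours, and the helper names `components`, `tree_depth` and `path_graph` are illustrative. It takes exponential time and is only meant to make the three cases of the definition concrete.

```python
from functools import lru_cache


def components(vertices, adj):
    """Connected components of the subgraph induced on `vertices`."""
    remaining, comps = set(vertices), []
    while remaining:
        start = remaining.pop()
        comp, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v in remaining:  # ignore neighbours outside the vertex set
                    remaining.remove(v)
                    comp.add(v)
                    stack.append(v)
        comps.append(frozenset(comp))
    return comps


def tree_depth(adj):
    """Tree depth following the three cases of the recursive definition."""
    @lru_cache(maxsize=None)
    def td(vertices):
        if len(vertices) == 1:                      # case |V| = 1
            return 1
        comps = components(vertices, adj)
        if len(comps) == 1:                         # connected: eliminate a vertex x
            return 1 + min(td(vertices - {x}) for x in vertices)
        return max(td(c) for c in comps)            # several components
    return td(frozenset(adj))


def path_graph(n):
    """Path on the vertices 0, ..., n-1 (illustrative helper)."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(tree_depth(star), tree_depth(path_graph(7)))
```

For the star, eliminating the center leaves an independent set, so the recursion stops after one elimination step; for the 7-vertex path, eliminating the middle vertex leaves two 3-vertex paths.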

Examples of graphs of bounded tree depth are stars, which have tree depth 2: deleting the center
vertex produces an independent set, and independent sets have tree depth 1. A slightly more complicated
example is a graph of tree depth 3 in which deleting the vertex in the middle produces
a graph whose components are stars. The vertex deletion process can also be interpreted as the task
of finding a depth-first graph search tree of minimum possible height for the graph. Formally, this is
captured by the following alternative definition of tree depth: The height of a rooted tree $T = (V,E)$ is
the length of a longest path from the root to a leaf. The closure of $T$ is the graph with vertex set $V$ that
has edges between all vertices $v \in V$ and $w \in V$ that lie on some root-to-leaf path in $T$. By induction, one
can show that the tree depth of a connected graph $G$ is 1 plus the minimum possible height of a rooted
tree whose closure contains $G$ as a subgraph [14].
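The alternative definition can be checked mechanically for a concrete tree. The sketch below is an illustration under assumptions of our own: a rooted tree is encoded as a dictionary from each vertex to its parent (the root maps to `None`), and the function names are hypothetical. It builds the closure and reports 1 plus the height when the closure contains all edges of the graph, which yields an upper bound on the tree depth.

```python
def closure_edges(parent):
    """Edges of the closure: every vertex is joined to each of its proper ancestors."""
    edges = set()
    for v in parent:
        a = parent[v]
        while a is not None:
            edges.add(frozenset((v, a)))
            a = parent[a]
    return edges


def height(parent):
    """Number of edges on a longest root-to-leaf path."""
    def depth(v):
        d = 0
        while parent[v] is not None:
            v = parent[v]
            d += 1
        return d
    return max(depth(v) for v in parent)


def tree_depth_bound(graph_edges, parent):
    """Return 1 + height if the closure contains the graph, otherwise None."""
    closure = closure_edges(parent)
    if all(frozenset(e) in closure for e in graph_edges):
        return 1 + height(parent)
    return None


# The path 0-1-2-3-4-5-6 with a balanced tree rooted at the middle vertex 3.
path_edges = [(i, i + 1) for i in range(6)]
balanced = {3: None, 1: 3, 5: 3, 0: 1, 2: 1, 4: 5, 6: 5}
print(tree_depth_bound(path_edges, balanced))
```

Two vertices lie on a common root-to-leaf path exactly when one is an ancestor of the other, which is why the closure is built from ancestor pairs.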

For any graph class $C$ of bounded tree depth, Definition 15 states that every graph $G \in C$ can be split
recursively into graphs of strictly decreasing tree depth by eliminating vertices $x$. The parallel splitting
process, which ends after a constant number of steps, can be implemented using a first-order formula.
In the following we will use the recursive definition of tree depth and combine it with Theorem 14 to
evaluate GSO-formulas on graphs of bounded tree depth using first-order formulas.

In order to get a handle on the components that arise during the elimination process, for a graph
$G = (V,E)$ and two different vertices $x, s \in V(G)$, called the elimination vertex and the selector vertex,
let us write $C_{x,s}$ for the set of vertices in the component of $G[V \setminus \{x\}]$ that contains $s$. Let us write $G_{x,s}$
for $G[C_{x,s} \cup \{x\}]$.

Lemma 16. For every $d$ there is a first-order formula $\varphi_d(x,s,y)$ such that for all graphs $G$ with $\mathrm{td}(G) \le d$
we have $G \models \varphi_d(x,s,y)$ if, and only if, $y \in V(G_{x,s})$.

Proof. The formula just has to test whether there is a path from $s$ to $y$ that does not go through $x$. Since
in graphs of tree depth at most $d$ all paths have length at most $2^d - 2$, as shown in [15], reachability can
be defined for them using a first-order formula. □
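The idea behind the proof can be illustrated by a search that is cut off after $2^d - 2$ steps, mirroring the fact that a first-order formula can only unfold a fixed number of path steps. This is a hedged sketch with an assumed adjacency-dictionary encoding and a hypothetical function name; it computes the set $C_{x,s}$ rather than constructing the formula itself.

```python
def component_avoiding(adj, x, s, d):
    """Vertices reachable from s without passing through x, in at most 2**d - 2 steps.

    For graphs of tree depth at most d, the step bound loses nothing because
    no path in such a graph is longer than 2**d - 2.
    """
    reached, frontier = {s}, {s}
    for _ in range(2 ** d - 2):
        frontier = {v for u in frontier for v in adj[u]
                    if v != x and v not in reached}
        if not frontier:
            break
        reached |= frontier
    return reached


# Path 0-1-2-3-4 (tree depth 3) with elimination vertex x = 2.
path5 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(component_avoiding(path5, 2, 0, 3), component_avoiding(path5, 2, 4, 3))
```

Adding $x$ to the returned set gives the vertex set of $G_{x,s}$.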

In order to prove Theorem 1, we prove the following lemma, where a colored graph is a graph that is
accompanied by a finite number of monadic color relations. (Formally, a colored graph is a $\tau$-structure
for a signature $\tau = \{E^2, C_1^1, \dots, C_m^1\}$.)

Lemma 17. Let $d \ge 1$. For every GSO-formula $\Phi$ on colored graphs there exists an FO-formula $\varphi_{\Phi,d}$
on colored graphs such that for every colored graph $G$ with $\mathrm{td}(G) \le d$ we have $G \models \varphi_{\Phi,d}$ if, and only if,
$G \models \Phi$.

Proof. It will be convenient to prove the lemma's claim only for connected graphs $G$. This is no loss
of generality since if $G$ is not connected (which can be tested using a first-order formula for graphs of
bounded tree depth), we add a single new vertex to $G$ that is connected to all vertices of $G$, arriving at a
new graph $G'$ in which the new vertex gets a new special color. We can then easily adjust the formula $\Phi$
to a formula $\Phi'$ so that for all $G'$ constructed in this way we have $G \models \Phi$ if, and only if, $G' \models \Phi'$. (The
formula $\Phi'$ must just "ignore" the new vertex.) Note that $\mathrm{td}(G') = 1 + \mathrm{td}(G)$.

We now prove the claim by induction on $d$. For $d = 1$, the only connected graph of tree depth 1
consists of a single vertex. Thus, we can trivially replace $\Phi$ by a formula $\varphi_{\Phi,1}$ as claimed.

For the inductive step from $d - 1$ to $d$, let

$$\varphi_{\Phi,d} = \exists x\,(\gamma_d(x) \wedge \alpha(x)).$$

Here, $\gamma_d(x)$ is a first-order formula that tests whether $G[V \setminus \{x\}]$ has tree depth $d - 1$. By definition of
the tree depth, this must be true for at least one vertex $x \in V$.

Our objective is to adjust the formula $\alpha_{\Phi,1}$ from Theorem 14 to form the formula $\alpha(x)$. Setting
$B_1 = \{x\}$ and introducing a new color $B_1$, we can view $G$ as a width-1 rooted structure in the sense of
Definition 12. Form the set $I$ by picking one vertex from each component of $G[V \setminus \{x\}]$ and let $G_i = G_{x,i}$
for $i \in I$. Then $F = (G_i)_{i \in I}$ is a rooted partition of $G$ in the sense of Definition 13.

Theorem 14 tells us that $\mathcal{I}_{k,q}(F) \models \alpha_{\Phi,1} \iff G \models \Phi$. Thus, our objective is to set up $\alpha(x)$ in such a
way that $G \models \alpha(x) \iff \mathcal{I}_{k,q}(F) \models \alpha_{\Phi,1}$. To achieve this, we modify $\alpha_{\Phi,1}$ so that we "simulate access
to" the structure $\mathcal{I}_{k,q}(F)$.

Inside $\alpha_{\Phi,1}$, we replace every occurrence of $\exists i(\psi)$, which quantifies over elements of the index set $I$,
by $\exists s_i(\neg(s_i = x) \wedge \psi)$, which quantifies over selector vertices of the graph $G$. We replace every occurrence
of an equality test $i = j$ by $\varphi_d(x,s_i,s_j)$ from Lemma 16. This formula verifies that $s_i$ and $s_j$ select the
same component of $G[V \setminus \{x\}]$. The tough part is replacing atoms $T(i)$. Such an atom tests whether
$T = \mathrm{type}_{k,q}(G_i)$ holds. By Lemma 6, the type of $G_i$ can be determined by testing $G_{x,s_i} \models \Omega$ for a finite
number of GSO-formulas $\Omega$.

For the test $G_{x,s_i} \models \Omega$ we cannot apply the induction hypothesis directly to $G_{x,s_i}$ since its tree depth
is still $d$. Instead, let us write $G^{-}_{x,s_i}$ for the graph $G[C_{x,s_i}]$ where we introduce a new color and color every
vertex $v \in C_{x,s_i}$ with this new color if there is an edge between $x$ and $v$ in $G$. This graph contains the same
information as $G_{x,s_i}$ does, only the edges to $x$ are now replaced by a coloring of the vertices. In particular,
we can transform every formula $\Omega$ into a formula $\Omega'$ such that $G_{x,s_i} \models \Omega \iff G^{-}_{x,s_i} \models \Omega'$.

The graph $G^{-}_{x,s_i}$ has tree depth at most $d - 1$ and is connected, so we can apply the induction hypothesis to
it. It states that for every GSO-formula $\Omega'$ there is an FO-formula $\varphi_{\Omega',d-1}$ such that $G^{-}_{x,s_i} \models \varphi_{\Omega',d-1} \iff
G^{-}_{x,s_i} \models \Omega'$.

Consider the formula $\lambda_T$ from Lemma 6. By replacing each $\Omega$ with $\varphi_{\Omega',d-1}$ inside $\lambda_T$ we get an
FO-formula $\omega_T$ such that $G^{-}_{x,s_i} \models \omega_T \iff T = \mathrm{type}_{k,q}(G_{x,s_i})$. As a final step, we modify $\omega_T$ to arrive at
a new formula $\omega_T(x,s_i)$ with the property $G^{-}_{x,s_i} \models \omega_T \iff G \models \omega_T(x,s_i)$. This last modification is easy
to achieve: Inside $\omega_T$, simply replace each quantifier $\exists y(\psi)$ by $\exists y(\varphi_d(x,s_i,y) \wedge \neg(x = y) \wedge \psi)$ to ensure
that $y$ is picked from $G^{-}_{x,s_i}$.

Putting it all together, starting from $\alpha_{\Phi,1}$, we have now arrived at a formula $\alpha(x)$ with the property
that $\mathcal{I}_{k,q}(F) \models \alpha_{\Phi,1}$ holds if, and only if, $G \models \alpha(x)$. □

5 Characterizing where FO and GSO Coincide On Graph Classes 
Closed Under Taking Induced Subgraphs 

We have already seen that $\mathrm{FO} =_C \mathrm{MSO} =_C \mathrm{GSO}$ holds for all classes $C$ of graphs that have bounded tree
depth. As we pointed out in the introduction in Theorem 2, this is "optimal" in the following sense: For
any class $C$ of graphs that is closed under taking subgraphs and that does not have bounded tree depth,
FO and MSO are not equally expressive on $C$. The reason is that if $C$ contains graphs of arbitrarily large
tree depth, by the second characterization of tree depth from the introduction, $C$ will contain graphs in
which there are arbitrarily long paths. Since $C$ is closed under taking subgraphs, these paths themselves
are also elements of $C$ and MSO can express that a path has even length, which FO cannot.

Although it is a natural requirement for a class of graphs that it should be closed under taking sub- 
graphs, it is also a strong requirement. For instance, the quite natural classes of all complete graphs or the 
class of all complete bipartite graphs are not closed under taking subgraphs. A less strict requirement, 
which broadens the range of classes C that we can study, is to require only that the class is closed under 
induced subgraphs. This encompasses for instance the two just-mentioned classes. For classes C closed 
under taking induced subgraphs, it is no longer true that if $C$ contains graphs of arbitrarily large tree depth, then
$\mathrm{FO} \neq_C \mathrm{MSO}$ must hold; the class of all cliques, for instance, is a counterexample.

In the present section we show that the tree depth of a class $C$ that is closed under taking induced
subgraphs is related to the question of whether $\mathrm{FO} =_C \mathrm{GSO}$ holds rather than to the question of whether
$\mathrm{FO} =_C \mathrm{MSO}$ holds. Indeed, it is an open problem for which classes $C$ of graphs closed under taking
induced subgraphs $\mathrm{FO} =_C \mathrm{MSO}$ holds. We discuss this in the conclusion.

The relationship between tree depth and $\mathrm{FO} =_C \mathrm{GSO}$ on classes closed under taking induced subgraphs
is summed up by Theorem 3 from the introduction. The theorem states that for every class $C$ of
graphs that is closed under taking induced subgraphs, we have $\mathrm{FO} =_C \mathrm{GSO}$ if, and only if, $C$ has bounded
tree depth.

The if-direction was already proved in Section 4. For the only-if part, recall the argument that we
used above for classes C that are closed under taking subgraphs: We argued that since C contains graphs 
containing arbitrarily long paths, C itself must contain all paths and MSO is more expressive than FO on 
paths. We wish to apply a similar argument now that C must only be closed under induced subgraphs, 
but it will no longer be the case that C will contain all paths as the examples of the class of all cliques 
and the class of all complete bipartite graphs show. Now, for these two examples GSO happens to be 
more expressive than FO since we can express that a clique or a complete bipartite graph has even size in 
GSO, but not in FO. But what happens when C contains neither all paths nor all cliques nor all complete 
bipartite graphs? 

Somewhat surprisingly, this cannot happen. We next prove a lemma that implies that every class of
graphs that is closed under induced subgraphs and has unbounded tree depth contains all paths or all
complete graphs or all complete bipartite graphs. Since GSO is more expressive
than FO on each of these classes, proving this lemma shows that Theorem 3 from the introduction
holds. We believe that Lemma 18 may be of independent interest. Its proof uses Ramsey's theorem.

Lemma 18. For every $k$ there is an $n(k)$ such that every graph that contains an $n(k)$-vertex path as a
subgraph contains a $K_k$ or a $K_{k,k}$ or a $k$-vertex path as an induced subgraph.

Proof. We start by ruling out some trivial cases and fixing the terminology. The claim is trivial for $k \le 1$,
so let $k \ge 2$. Let $G'$ be a graph that contains an $n$-vertex path with $n \ge n(k)$ as a subgraph (we will fix $n(k)$
later). We consider the graph $G$ induced by the $n$-vertex path in $G'$. From this construction we know that
$G$ contains a Hamiltonian path. It will be convenient to denote paths by their sequence of vertices, that
is, $P = (v_1,\dots,v_\ell)$ denotes the path with vertex set $V(P) = \{v_1,\dots,v_\ell\}$ and edge set
$E(P) = \{\{v_i,v_{i+1}\} \mid i \in \{1,\dots,\ell-1\}\}$. Let us write $P[i]$ for $v_i$. Since we can name the vertices arbitrarily, we may assume
that $V(G) = [n] = \{1,\dots,n\}$ holds and, since $G$ has a Hamiltonian path, we may additionally assume that
$(1,2,3,\dots,n)$ is this path.

A path $P = (v_1,\dots,v_\ell)$ in $G$ is increasing if $v_1 < v_2 < \dots < v_\ell$. A path $P$ from $v$ to $w$ is a shortest
increasing path if it is increasing and if there is no shorter increasing path from $v$ to $w$. For all $v, w \in V(G)$
with $v < w$ we fix a shortest increasing path $P_{v,w}$ from $v$ to $w$. Note that such a path exists because
$(v, v+1, \dots, w)$ is an increasing path from $v$ to $w$. An important property of shortest increasing paths is
that they are all induced paths of the graph $G$. In particular, if we find a shortest increasing path of length
at least $k$, we are done.
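Shortest increasing paths are easy to compute, and the claim that they are induced can be tested on examples. The sketch below assumes that the vertices are the integers $1,\dots,n$, that $(1,2,\dots,n)$ is a path in the graph (so the target is always reachable), and that the graph is an adjacency dictionary; the function names are ours. A chord between two non-consecutive vertices of an increasing path would itself be an increasing shortcut, which is why a shortest such path has none.

```python
from collections import deque


def shortest_increasing_path(adj, v, w):
    """Breadth-first search from v to w that only follows edges to larger vertices."""
    prev, queue = {v: None}, deque([v])
    while queue:
        u = queue.popleft()
        if u == w:
            break
        for nxt in sorted(adj[u]):
            if u < nxt <= w and nxt not in prev:
                prev[nxt] = u
                queue.append(nxt)
    path, node = [], w
    while node is not None:          # walk back from w to v
        path.append(node)
        node = prev[node]
    return path[::-1]


def is_induced_path(adj, path):
    """True if the only edges among the path's vertices join consecutive vertices."""
    pos = {u: i for i, u in enumerate(path)}
    return all((b in adj[a]) == (abs(pos[a] - pos[b]) == 1)
               for a in path for b in path if a != b)


# Hamiltonian path 1-2-...-6 with the extra edges {1,4} and {2,6}.
g = {1: {2, 4}, 2: {1, 3, 6}, 3: {2, 4}, 4: {1, 3, 5}, 5: {4, 6}, 6: {2, 5}}
p = shortest_increasing_path(g, 1, 6)
print(p, is_induced_path(g, p))
```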
For every 4-element subset $\{v_1,v_2,v_3,v_4\} \subseteq V(G)$ with $v_1 < v_2 < v_3 < v_4$ let $Q(\{v_1,v_2,v_3,v_4\}) =
G[V(P_{v_1,v_2}) \cup V(P_{v_3,v_4})]$. This means that $Q$ is the part of $G$ induced by the vertices of the two shortest
increasing paths $P_{v_1,v_2}$ and $P_{v_3,v_4}$. It will contain all edges on these paths and all edges between vertices
on the one path and vertices on the other path (since the paths themselves are induced, there are no other
edges in $Q$). In the following, we will call the vertices from $P_{v_1,v_2}$ the left vertices and the vertices from
$P_{v_3,v_4}$ the right vertices of $Q$.

If any $Q(\{v_1,v_2,v_3,v_4\})$ has order more than $2k + 2$, then at least one path must have length at least $k$
and we are done. So, we may assume that $|Q(\{v_1,v_2,v_3,v_4\})| \le 2k + 2$.

Given graphs $Q(\{v_1,v_2,v_3,v_4\})$ and $Q(\{w_1,w_2,w_3,w_4\})$, let us say that there is an order isomorphism
between them if

1. there is a graph isomorphism $\iota : V(Q(\{v_1,v_2,v_3,v_4\})) \to V(Q(\{w_1,w_2,w_3,w_4\}))$,

2. $\iota$ is order-preserving, that is, for $i < j$ we have $\iota(i) < \iota(j)$, and

3. $\iota(v_j) = w_j$ for $j \in \{1,\dots,4\}$.

Applying Ramsey's Theorem. Consider the set $F$ of all 4-element subsets of $V(G)$. We color its
elements as follows: Two sets $A, B \in F$ get the same color if, and only if, there is an order isomorphism
between $Q(A)$ and $Q(B)$. Note that, since the number of possible graphs $Q$ depends only on $k$ (since
their sizes are at most $2k + 2$), the number $c$ of different colors that we need is bounded by a constant
$c(k)$ that depends only on $k$. Now, $F$ is a set of 4-element subsets of a set $V(G)$ of size $n$ colored with
at most $c(k)$ colors. By Ramsey's theorem, the Ramsey number $r(4,c(k),4k)$ has the property that if
$n \ge r(4,c(k),4k)$, then there is a subset $M \subseteq V(G)$ of size $|M| \ge 4k$ such that all 4-element subsets of
$M$ have the same color. We choose $n(k) = r(4,c(k),4k)$, so we know that such a monochromatic set $M$
always exists inside our graph $G$. Let $M = \{v_1,v_2,v_3,\dots,v_{4k}\}$ with $v_1 < \dots < v_{4k}$. Let us write $P_i$ for the
path $P_{v_i,v_{i+1}}$ for $i \in \{1,\dots,4k-1\}$ in the following, that is, for the path that leads from one vertex in $M$ to
the next. Note that the vertices of each $P_i$ lie outside $M$, except for the first and last vertex.

Since all 4-element subsets $A$ of $M$ have the same color, the graphs $Q(A)$ are all order isomorphic to
a single graph $Q$. Recall that in $Q$ there is a left path and a right path. These two paths must have the
same length since $Q$ is order isomorphic to $Q(\{v_1,v_2,v_3,v_4\})$ and also to $Q(\{v_3,v_4,v_5,v_6\})$ (here, we use
$k \ge 2$). Because of the first isomorphism, $P_{v_3,v_4}$ is isomorphic to the right path of $Q$ and because of the
second isomorphism it is also isomorphic to the left path of $Q$, which must hence have the same length.
Let $\ell$ be this length and let us say that the first vertices of these paths are at position 1, the second vertices
are at position 2, and so on up to position $\ell$.

We will now make a case distinction depending on which edges are present between the vertices of
these paths. Each case will lead to either an induced path of length $k$, an induced $K_k$, or an induced $K_{k,k}$
in $G$.

Case 1: No Edges Between Left and Right Path. First assume that there are no edges in $Q$ between
the vertices on the left path and the right path. We claim that in this case the distance between $v_1$ and
$v_{4k}$ in the graph $H = G[\bigcup_{i=1}^{4k-1} V(P_i)]$ is at least $2k$. For vertices from the set $V(P_1)$ there can be no edge
in $H$ to any vertex in one of the sets $V(P_i)$ for $i > 2$ since such an edge would constitute an edge in
$Q(\{v_1,v_2,v_i,v_{i+1}\})$ between the left path and the right path. Thus, starting from $v_1$, to get to $v_{4k}$ inside
$H$, we need to go through at least one vertex from $V(P_2)$. Next, we can argue in the same way that there
is no edge from any vertex in $V(P_2)$ to a vertex in any $V(P_i)$ for $i > 3$. Thus, a path to $v_{4k}$ next needs to
contain at least one vertex from $V(P_3)$. Applying the same argument repeatedly shows that a path from
$v_1$ to $v_{4k}$ in $H$ must contain at least one vertex from each $V(P_i)$ and, thus, must have length at least $2k$.

We have now seen that the distance from $v_1$ to $v_{4k}$ in $H$ is at least $2k$. On the other hand, there is an
increasing path from $v_1$ to $v_{4k}$ in $H$, namely the union of all $P_i$. In particular, there is a shortest increasing
path and its length cannot be less than the distance. Thus, there is an induced path of length at least $2k$ in $H$ and
hence also in $G$.

Case 2: An Edge Between Left and Right Path at the Same Position. We now consider the case that
in $Q$ there is an edge from a vertex on the left path to a vertex on the right path at the same position $j$.
We claim that in this case

$$H = G[\{P_i[j] \mid i \in \{1,3,5,7,\dots,2k-1\}\}] \cong K_k.$$

To see this, consider any two different vertices $u = P_i[j]$ and $v = P_{i'}[j]$ of $H$ for $i + 1 < i'$. By assumption,
in $Q(\{v_i,v_{i+1},v_{i'},v_{i'+1}\})$ there is an edge between the vertices at position $j$ and these are exactly $u$ and $v$.

Case 3: An Edge Between Left and Right Path at Different Positions. In this last case, we assume that
neither the first nor the second case holds. Then there must be an edge between the left and right path
at some positions $j_l$ and $j_r$, but there are no edges between vertices at the same position, in particular,
there are no edges between the vertices at position $j_l$ in the left and right paths nor between the vertices
at position $j_r$. We claim that in this case

$$H = G\bigl[\{P_i[j_l] \mid i \in \{1,3,5,\dots,2k-1\}\} \cup \{P_{i'}[j_r] \mid i' \in \{2k+1,2k+3,\dots,4k-1\}\}\bigr] \cong K_{k,k}.$$

To see this, first consider any two vertices $u = P_i[j_l]$ and $v = P_{i'}[j_l]$ for $i + 1 < i'$ among the first $k$ vertices.
Since there is no edge between the vertices at position $j_l$ in $Q$, neither is there an edge between these
vertices in $Q(\{v_i,v_{i+1},v_{i'},v_{i'+1}\})$ and, hence, there is no edge between $u$ and $v$. In the same way, we
see that there is no edge between two vertices $P_i[j_r]$ and $P_{i'}[j_r]$. Finally, for any two vertices $P_i[j_l]$ and
$P_{i'}[j_r]$ there is an edge, because there is one in $Q(\{v_i,v_{i+1},v_{i'},v_{i'+1}\})$ between the vertex at position $j_l$
on the left path and the vertex at position $j_r$ on the right path. □



6 Conclusion 

So far, studies of the expressive power of first-order and monadic second-order logics have been devoted 
to identifying classes of structures where MSO is more expressive than FO. For example, MSO on words 
can express exactly the regular languages while different kinds of FO express natural restrictions of regular 
languages. In the paper at hand we broadened this research by identifying classes of graphs where MSO
and GSO coincide with FO, and gave complete characterizations of where these logics coincide with FO
for classes of graphs that satisfy natural closure conditions.

We showed that on classes of graphs of bounded tree depth, FO, MSO, and GSO have the same
expressive power and used this result to show that having bounded tree depth is a sufficient and necessary
property for $\mathrm{FO} =_C \mathrm{MSO} =_C \mathrm{GSO}$ on classes $C$ of graphs that are closed under taking subgraphs, and
$\mathrm{FO} =_C \mathrm{GSO}$ on classes $C$ of graphs that are closed under taking induced subgraphs. In our proofs we
developed a composition theorem that shows how to compute the type of a structure from the types of 
an unbounded number of substructures using first-order formulas, and proved that any class of graphs of 
unbounded tree depth that is closed under taking induced subgraphs contains all paths or all cliques or 
all complete bipartite graphs. 

The main open question that remains is to give a characterization of where $\mathrm{FO} =_C \mathrm{MSO}$ holds for
graph classes $C$ closed under taking induced subgraphs. By considering the class $C$ of cliques on which
we have $\mathrm{FO} =_C \mathrm{MSO}$, but unbounded tree depth, one can see that bounding the tree depth does not lead
to a complete characterization in this case. One idea is to develop an adjusted notion of clique width that 
has the same relation to clique width as tree depth has to tree width. 

Another direction is to consider other kinds of monadic second-order logics like MSO with parity 
predicates, which test whether bound relations have even cardinality. For such logics the classical con- 
structive composition theorem that combines a bounded number of substructures using propositional 
formulas still works, but to compute the type of a structure from the types of an unbounded number of 
substructures, first-order logic is not enough. Which kinds of first-order logics do we need to combine the types
of which kinds of monadic second-order logics? Answers to this question would yield composition theorems
that can be used to show equal expressibility on bounded tree depth graphs.